AITopics | frozen lake environment

Collaborating Authors

frozen lake environment

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Reinforcement Learning Method for Environments with Stochastic Variables: Post-Decision Proximal Policy Optimization with Dual Critic Networks

Felizardo, Leonardo Kanashiro, Fadda, Edoardo, Brandimarte, Paolo, Del-Moral-Hernandez, Emilio, Nascimento, Mariá Cristina Vasconcelos

arXiv.org Artificial IntelligenceNov-18-2025

This paper presents Post-Decision Proximal Policy Optimization (PDPPO), a novel variation of the leading deep reinforcement learning method, Proximal Policy Optimization (PPO). The PDPPO state transition process is divided into two steps: a deterministic step resulting in the post-decision state and a stochastic step leading to the next state. Our approach incorporates post-decision states and dual critics to reduce the problem's dimensionality and enhance the accuracy of value function estimation. Lot-sizing is a mixed integer programming problem for which we exemplify such dynamics. The objective of lot-sizing is to optimize production, delivery fulfillment, and inventory levels in uncertain demand and cost parameters. This paper evaluates the performance of PDPPO across various environments and configurations. Notably, PDPPO with a dual critic architecture achieves nearly double the maximum reward of vanilla PPO in specific scenarios, requiring fewer episode iterations and demonstrating faster and more consistent learning across different initializations. On average, PDPPO outperforms PPO in environments with a stochastic component in the state transition. These results support the benefits of using a post-decision state. Integrating this post-decision state in the value function approximation leads to more informed and efficient learning in high-dimensional and stochastic environments.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IJCNN64981.2025.11227565

2504.0515

Country:

South America > Brazil (0.14)
Europe > Italy (0.14)

Genre:

Research Report > New Finding (0.94)
Research Report > Experimental Study (0.94)

Industry: Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Environment Agnostic Goal-Conditioning, A Study of Reward-Free Autonomous Learning

Åström, Hampus, Topp, Elin Anna, Malec, Jacek

arXiv.org Artificial IntelligenceNov-7-2025

In this paper we study how transforming regular reinforcement learning environments into goal-conditioned environments can let agents learn to solve tasks autonomously and reward-free. We show that an agent can learn to solve tasks by selecting its own goals in an environment-agnostic way, at training times comparable to externally guided reinforcement learning. Our method is independent of the underlying off-policy learning algorithm. Since our method is environment-agnostic, the agent does not value any goals higher than others, leading to instability in performance for individual goals. However, in our experiments, we show that the average goal success rate improves and stabilizes. An agent trained with this method can be instructed to seek any observations made in the environment, enabling generic training of agents prior to specific use cases.

machine learning, reinforcement learning, selection, (15 more...)

arXiv.org Artificial Intelligence

2511.04598

Genre: Research Report > New Finding (0.34)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

Add feedback

An Open-source Sim2Real Approach for Sensor-independent Robot Navigation in a Grid

Abrar, Murad Mehrab, Mondal, Souryadeep, Hickner, Michelle

arXiv.org Artificial IntelligenceJan-6-2025

This paper presents a Sim2Real (Simulation to Reality) approach to bridge the gap between a trained agent in a simulated environment and its real-world implementation in navigating a robot in a similar setting. Specifically, we focus on navigating a quadruped robot in a real-world grid-like environment inspired by the Gymnasium Frozen Lake -- a highly user-friendly and free Application Programming Interface (API) to develop and test Reinforcement Learning (RL) algorithms. We detail the development of a pipeline to transfer motion policies learned in the Frozen Lake simulation to a physical quadruped robot, thus enabling autonomous navigation and obstacle avoidance in a grid without relying on expensive localization and mapping sensors. The work involves training an RL agent in the Frozen Lake environment and utilizing the resulting Q-table to control a 12 Degrees-of-Freedom (DOF) quadruped robot. In addition to detailing the RL implementation, inverse kinematics-based quadruped gaits, and the transfer policy pipeline, we open-source the project on GitHub and include a demonstration video of our Sim2Real transfer approach. This work provides an accessible, straightforward, and low-cost framework for researchers, students, and hobbyists to explore and implement RL-based robot navigation in real-world grid environments.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2411.03494

Country: North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.90)

Add feedback

Gym Tutorial: The Frozen Lake

#artificialintelligenceNov-13-2020, 14:21:11 GMT

In this article, we are going to learn how to create and explore the Frozen Lake environment using the Gym library, an open source project created by OpenAI used for reinforcement learning experiments. The Gym library defines a uniform interface for environments what makes the integration between algorithms and environment easier for developers. Among many ready-to-use environments, the default installation includes a text-mode version of the Frozen Lake game, used as example in our last post. The first step to create the game is to import the Gym library and create the environment. The next line calls the method gym.make() to create the Frozen Lake environment and then we call the method env.reset() to put it on its initial state.

frozen lake environment, frozen lake game, reinforcement, (12 more...)

#artificialintelligence

Genre: Instructional Material (0.37)

Industry: Leisure & Entertainment (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.41)

Add feedback